"Modern" statistics may generate more replicable characterizations of data, because at least in some
respects the influences of more extreme and less representative scores are minimized. The present
paper explains both trimmed and winsorized statistics, uses a mini-Monte Carlo demonstration of
robust regression analysis, and describes other computer-intensive methods such as jackknifing
and bootstrapping.

Almost everyone knows about the average or the mean even if they have not heard of any
other concept in statistics. The mean is the very first statistical concept taught in schools, and
students are told that it is a great tool to represent a whole bunch of numbers. Practically
speaking, the mean is easy to compute, and its calculation helps students’ arithmetic skills improve.
However, is it really true that the mean is the greatest tool to represent any set of numbers?

Comparison of mean and median in terms of their robustness: 

Turkish people have every right to be proud of their countryman Hedo, not only because he is
one of the very few European players in the NBA, but also because of his progress as a basketball
player in the 2008-2009 season. A skeptical PhD student from Turkey wanted to see if Hedo really
deserved the appreciation of 70 million Turkish people. In between his hectic schedule of
statistics courses, he could only find enough time to watch one game per season, and decided to
record the number of points Hedo scored against the NBA champion of the previous year. He
assumed that would be a good estimate of his performance during that particular year. He formed
the following table and started to wait for the Orlando Magic game against the Boston Celtics in
the 2008-2009 season.
Table 1. Hedo’s score summary between 1999-2008

Season       Score against Champ.   Number of Years in NBA
1999-2000     8                     1
2000-2001     9                     2
2001-2002    10                     3
2002-2003    10                     4
2003-2004    11                     5
2004-2005    12                     6
2005-2006    12                     7
2006-2007    13                     8
2007-2008    14                     9


The busy PhD student, believing the mean would be a great way of representing Hedo’s
performance as a basketball player, decided to report only those of Hedo’s statistics that are
moments about the mean.
Table 2. Moments about the mean statistics values for Hedo sample between 1999-2008

Statistic                                        Value
X̄                                                11
SDx                                              1.94
Skewness                                         0
Kurtosis                                         -0.8
Pearson r (in relation to experience as years)   0.99



Over the past 9 years, he prepared different tables filled with similar numbers. With no
knowledge of how well Hedo had been playing so far this season, our skeptical PhD student,
considering that the Pearson r was almost equal to 1, expected Hedo to score 15 points or so
in the game against the Celtics. However, Hedo scored 45 points, which will probably
lead him to be the second Turkish player in an All-Star game. Our PhD student had to start over
from scratch to form a new table of statistics.
Table 3. Moments about the mean statistics values for Hedo sample between 1999-2009

Statistic                                        Value
X̄                                                14.4
SDx                                              10.91
Skewness                                         3
Kurtosis                                         9.24
Pearson r (in relation to experience as years)   0.66



Later on, he wondered which statistic Hedo’s latest performance had affected the most. So, he
calculated the percent change for X̄ and SDx as 25% and 462%, respectively. Apparently, the
effect of Hedo’s latest performance grew with the order of the moment about the mean.
Although our PhD student didn’t have the opportunity to test his hypothesis for the third moment
about the mean (since skewness in the first data set was 0), he extended his percent change
calculations to the fourth moment about the mean, and found that there was a 1569% change
in the kurtosis value. He thought perhaps it was time to update his knowledge, since he had been
in graduate school for 10 years now, and decided to read more about modern statistical methods.
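
The moments discussed above can be recomputed directly from the scores in Table 1. The following Python sketch is only an illustration, not the procedure the student used: the function names are hypothetical, the standard deviation uses n - 1 in the denominator, and skewness and kurtosis are left out because their values depend on which formula a given software package applies.

```python
import math
import statistics

# Scores from Table 1, plus the 45-point game of the 2008-2009 season.
scores_before = [8, 9, 10, 10, 11, 12, 12, 13, 14]
scores_after = scores_before + [45]


def pearson_r(xs, ys):
    """Pearson correlation computed from deviation scores."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def summarize(scores):
    """Mean, SD (n - 1 denominator), and r with years of experience."""
    years = list(range(1, len(scores) + 1))
    return {
        "mean": statistics.mean(scores),
        "sd": statistics.stdev(scores),
        "r": pearson_r(years, scores),
    }


def percent_change(old, new):
    """Change from old to new, as a percentage of the old value."""
    return 100 * (new - old) / abs(old)


before = summarize(scores_before)
after = summarize(scores_after)
for name in ("mean", "sd", "r"):
    print(name, round(before[name], 2), round(after[name], 2))
for name in ("mean", "sd"):
    print(name, "percent change:", round(percent_change(before[name], after[name]), 1))
```

Running the sketch with and without the last score makes it easy to see how much a single game moves each statistic.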

However, modern statistical methods are not the only robust statistics. The sample median
could indeed have been a great choice to estimate Hedo’s overall performance as an NBA player.
The median scores before and after this year’s game were 11 and 11.5, respectively. These
values were not only very close to each other but also similar to the mean of Hedo’s scores
before this year’s game. Hedo’s performance at this year’s game was definitely an outlier (Was it
really? How would we know this mathematically?), but that had little effect on the sample
median, in contrast to its effect on all the moments about the mean. Two questions emerge at
this point:

1) Why is the median a robust procedure whereas the mean isn’t?

The answer to the first question is straightforward: when calculating the mean, one has to consider
the effect of each and every data point, including the outliers, whereas the median disregards the
actual data values and is concerned with their positions instead.



$$\lim_{x_n \to \infty} \bar{X} = \lim_{x_n \to \infty} \frac{x_1 + x_2 + \cdots + x_{n-1} + x_n}{n} \qquad (1)$$



As you see from the formula of the mean, when all values but $x_n$ are fixed and $x_n$ goes to
infinity, X̄ approaches infinity. In short, one bizarre x value can change the sample mean
dramatically. One of the many ways to determine whether a statistic is robust is to look at
the shape of the influence function, which describes the effect of one outlier (Reid, 2006). The
influence function is actually the derivative of a functional, and it measures the effect of a small
perturbation on a certain statistic. See Wilcox (2005) for a comparison of the influence
functions of some robust statistics. Another method to test for robustness is to simulate
contaminated data sets and see how well the estimator does (Rousseeuw, 1991). We will try this
with a Monte Carlo simulation in this paper. The easiest method, however, is the tool named the
breakdown point. As defined by Donoho & Huber (1983), the breakdown point is the smallest
amount of contamination that may cause an estimator to take large bizarre values. The
breakdown point for the average is 11% (1 in 9) in the Hedo example, since one outlier was enough
for the average to misrepresent the nature of the data. However, the breakdown point for the
median is 50%, which means at least five scores (five outliers) are needed to divert the
value of the median. For an infinitely large number of data points, the breakdown point for the mean
approaches $\lim_{n \to \infty} 1/n = 0$, whereas $\lim_{n \to \infty} (n-1)/(2n)$ stays at 0.5 for the median.
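
The breakdown point can be illustrated with a short Python sketch. It is an illustration under assumptions, not part of the original analysis: the helper `contaminate` and the contaminating value of one million are hypothetical choices, and the data are the nine scores of Table 1.

```python
import statistics

scores = [8, 9, 10, 10, 11, 12, 12, 13, 14]  # Table 1
BIZARRE = 10**6  # hypothetical contaminating value, chosen only for illustration


def contaminate(data, k, value=BIZARRE):
    """Replace the k largest scores with an arbitrarily large value."""
    ordered = sorted(data)
    return ordered[: len(ordered) - k] + [value] * k


# For each number of outliers k, store (mean, median) of the contaminated data.
results = {}
for k in range(0, 6):
    dirty = contaminate(scores, k)
    results[k] = (statistics.mean(dirty), statistics.median(dirty))
    print(k, "outliers -> mean:", round(results[k][0], 1), "median:", results[k][1])
```

With nine scores, a single replaced value is enough to carry the mean away, while the median stays among the original scores until five of the nine values have been replaced.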

2) How can an outlier be detected mathematically? 

An effective method to answer the second question is called the jackknife algorithm (Thompson,
2008). In this method, after the initial calculation of the sample statistic, each case in turn is
dropped, and the statistic is recalculated for the new subset. Some additional computations
provide a confidence interval for the statistic. For example, the table shows the lower and upper
confidence limits around the mean, calculated by an Excel add-in for Hedo’s points as they
were given earlier.

Table 4. Jackknife Statistics for Hedo sample between 1999-2008

Jackknife Statistics
mean         14.40
std. error   3.45
alpha        0.05
t            1.83
LCL          8
UCL          21



Any score outside the range of 8 to 21 is considered an outlier in the Hedo case.
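
The leave-one-out idea behind Table 4 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the Excel add-in's implementation: it uses all ten scores (including the 45-point game), takes the critical value t = 1.83 from Table 4 rather than computing it, and rounds the limits to whole points as the table does.

```python
import math
import statistics

scores = [8, 9, 10, 10, 11, 12, 12, 13, 14, 45]
T_CRITICAL = 1.83  # taken from Table 4, not computed here

n = len(scores)

# Drop each case in turn and recompute the mean on the remaining n - 1 scores.
loo_means = [statistics.mean(scores[:i] + scores[i + 1:]) for i in range(n)]

# Jackknife estimate and standard error of the mean.
jack_mean = statistics.mean(loo_means)
jack_se = math.sqrt((n - 1) / n * sum((m - jack_mean) ** 2 for m in loo_means))

# Confidence limits, rounded to whole points as in Table 4.
lcl = round(jack_mean - T_CRITICAL * jack_se)
ucl = round(jack_mean + T_CRITICAL * jack_se)

outliers = [s for s in scores if s < lcl or s > ucl]
print("mean:", round(jack_mean, 2), "std. error:", round(jack_se, 2))
print("LCL:", lcl, "UCL:", ucl, "flagged:", outliers)
```

For the mean, the jackknife standard error coincides with the usual standard error of the mean, which is one way to check the sketch against a hand calculation.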

Robust statistics on multivariate statistics: 

Let’s continue our discussion of robust statistics with multivariate statistics. The following
scatter graphs show the linear relationship between Hedo’s number of years of experience in the
NBA and his performance in terms of the points he scored in a certain game. It is clear from
Figure 1 and Figure 2 that the outlier (the score in 2008-2009 was 45 points) greatly affected the line
of best fit. Comparing the r² values of both sets tells us that there was approximately a 50% decrease
in the linearity (0.99² - 0.66² = 0.55). We are done with Hedo for now.





Figure 1: Hedo Scattergram Case 1
Figure 2: Hedo Scattergram Case 2



Is there any other regression method that is more robust than the method of least squares (LS)?

Rousseeuw (1984) suggests that minimizing the median of squared residuals (LMS), instead of
minimizing the sum of squared residuals, would produce more robust estimation. After all, as we
explained earlier, the median is more robust to change than the mean, and thus than
the sum. Given the data generating process $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, a Monte
Carlo simulation, iterated 10,000 times, of the slopes of samples of 11 observations with parameter
values $\beta_0 = 1$ (y-intercept), $\beta_1 = 5$ (slope), and $\sigma = 10$, and with an outlier
factor of 100 added to the 11th observation, produced the following two figures for LS and LMS,
respectively.



Figure 3: Least squares regression (empirical histogram of slope estimates for 10,000 repetitions)

Figure 4: Least Median of Squares (empirical histogram of slope estimates for 10,000 repetitions)



However, although LMS estimates the slopes better, it is far less precise than LS. The
following two figures show this with a Monte Carlo simulation of slopes for clean and
contaminated data with the same parameters, i.e., $\beta_1 = 5$.




Figure 5. Comparison of LS and LMS on clean data (x-axis: slope estimates for clean data)

Figure 6. Comparison of LS and LMS on dirty data (x-axis: slope estimates for dirty data)
The Monte Carlo simulations reveal that ordinary least squares is not a robust estimator:
it is weak against extreme values in the data. On the other hand, LMS is not much affected by
the outliers; however, it suffers from low precision.
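
A scaled-down version of such a simulation can be sketched in Python. Several details are assumptions made only for illustration and are not taken from the original study: the predictor values x = 1, ..., 11, the intercept of 1 (which does not affect the slope estimates), normally distributed errors, 300 repetitions instead of 10,000, and an approximate LMS fit that searches only the lines passing through pairs of observations.

```python
import itertools
import random
import statistics

BETA0, BETA1, SIGMA = 1.0, 5.0, 10.0  # intercept (assumed), slope, error SD
OUTLIER = 100.0                        # amount added to the 11th observation
X = list(range(1, 12))                 # assumed design: x = 1, ..., 11
REPS = 300                             # far fewer than 10,000, to keep it fast


def ls_slope(xs, ys):
    """Ordinary least squares slope."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def lms_slope(xs, ys):
    """Approximate LMS slope: among the lines through every pair of points,
    keep the one with the smallest median squared residual."""
    best_slope, best_crit = None, float("inf")
    for i, j in itertools.combinations(range(len(xs)), 2):
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        squared = [(y - intercept - slope * x) ** 2 for x, y in zip(xs, ys)]
        crit = statistics.median(squared)
        if crit < best_crit:
            best_slope, best_crit = slope, crit
    return best_slope


def simulate(contaminated, rng):
    """Draw one sample and return its (LS slope, LMS slope)."""
    ys = [BETA0 + BETA1 * x + rng.gauss(0, SIGMA) for x in X]
    if contaminated:
        ys[-1] += OUTLIER
    return ls_slope(X, ys), lms_slope(X, ys)


rng = random.Random(2009)
dirty = [simulate(True, rng) for _ in range(REPS)]
ls_dirty = [ls for ls, _ in dirty]
lms_dirty = [lms for _, lms in dirty]

print("LS  mean slope:", round(statistics.mean(ls_dirty), 2))
print("LMS median slope:", round(statistics.median(lms_dirty), 2))
print("LS  SD:", round(statistics.stdev(ls_dirty), 2),
      "LMS SD:", round(statistics.stdev(lms_dirty), 2))
```

Calling `simulate(False, rng)` instead gives the clean-data comparison, where the spread of the two sets of slope estimates can be compared directly.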

Couldn’t we just get rid of the outlier data instead of going through the complicated process of
LMS? In fact, another robust estimator called Least Trimmed Squares (LTS) follows this approach.
Instead of working on LTS, we will leave this to the reader, and go back to univariate data to talk
more about trimming.

Trimming and Winsorizing 

Hedo is not the only Turkish basketball player in the NBA, nor is he the first to play in an
All-Star game. Mehmet Okur, aka Memo, was selected as an All-Star player in 2007-2008. A
random sample of the points he scored in 10 games throughout his NBA career is 22, 22, 28,
27, 19, 2, 19, 88, 100, and 1. Some of the scores are more likely than others to have been drawn
from an assumed normal distribution of Memo’s scores, and some appear so bizarre that they do
not seem to belong to a normally distributed population of Memo’s scores. We
speculate that the reasons for the bizarreness are:

1) Inaccurate data entry, because nba.com is maintained by a careless webmaster.

2) An injury: after looking at the details of the games, we figured out that Memo was injured
and left some games early.

Otherwise, without being able to explain the reasons for the bizarre data, we should merely
conclude that Memo performed within his potential, and should keep the data as they are. This
leads us to treat the outliers like any other data, and not discriminate against them.

After finding a valid reason to treat outliers as data that contaminated Memo’s overall
performance, we sort the data in increasing order and trim 20% of the data from both
ends. This leaves us only 60% of the original sample, namely 19, 19, 22, 22, 27, and 28,
as our new trimmed sample. Wilcox (1997) suggests 20% as the best amount of trimming for
general purposes. Considering the reasons behind the bizarreness of our data, we follow his advice
in this case. The main advantage of using a trimmed mean is that we do not risk the normality
presumption, because if Memo’s scores throughout his entire career are normally distributed, the
trimmed mean will be equal to the mean (Erceg-Hurn & Mirosevich, 2008).

What happens if we decide to trim 50% of an odd-numbered sample? We are then left with the
median only as our trimmed mean. The other extreme, 0% trimming, merely gives us the
sample mean.

Alternatively, we can replace the trimmed values with the first and last values in our trimmed
sample. Memo’s winsorized sample becomes 19, 19, 19, 19, 22, 22, 27, 28, 28, and 28. By doing
so, we are paying more attention to those scores near the centre by giving less importance to the
extremes. The following table compares some statistics of our original sample, trimmed sample,
and winsorized sample.

Table 5. Comparison of robustness methods in moments about the mean statistics for Memo’s
scores

Statistic   Original Sample   20% Trimmed Sample   20% Winsorized Sample
X̄           32.8              22.83                22.5
Median      22                22                   22
SDx         33.65             3.87                 4.18
Skewness    1.47              0.5                  0.26
Kurtosis    1.05              -1.73                -2.12
Range       99                9                    9
If we choose a trim percentage of 25%, the range automatically becomes the IQR (interquartile
range), which is the difference between the 75th and 25th percentiles of the data set. This is
given as a robust descriptive statistic of dispersion in Thompson (2008).
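
The trimming and winsorizing steps described in this section can be sketched in Python. The function names `trim` and `winsorize` are hypothetical, and the sketch assumes that the number of scores removed from each end is the trimming proportion times the sample size, rounded down.

```python
import math
import statistics

scores = [22, 22, 28, 27, 19, 2, 19, 88, 100, 1]  # Memo's sample


def trim(data, proportion):
    """Sort the data and drop the given proportion of scores from each end."""
    ordered = sorted(data)
    k = math.floor(proportion * len(ordered))
    if 2 * k >= len(ordered):
        raise ValueError("trimming would remove every observation")
    return ordered[k: len(ordered) - k]


def winsorize(data, proportion):
    """Replace the trimmed scores with the nearest retained score on each end."""
    kept = trim(data, proportion)
    k = (len(data) - len(kept)) // 2
    return [kept[0]] * k + kept + [kept[-1]] * k


trimmed = trim(scores, 0.20)
winsorized = winsorize(scores, 0.20)
print("trimmed sample:", trimmed)
print("trimmed mean:", round(statistics.mean(trimmed), 2))
print("winsorized sample:", winsorized)
print("winsorized mean:", round(statistics.mean(winsorized), 2))
```

With an odd-numbered sample, `trim(data, 0.5)` leaves only the middle score, and `trim(data, 0)` returns the whole sorted sample, matching the two extremes discussed above.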

Bootstrapping 

Bootstrapping is a resampling method performed with the help of computer software, such as an
Excel add-in. Consider Memo’s sample of scores: 22, 22, 28, 27, 19, 2, 19, 88, 100, and 1, with a
sample mean of 32.8. An Excel add-in replaces the original sample with 12 observations randomly
selected from it with replacement. One random sample generated by bootstrapping might be
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 with a mean of 1, or another might be 100 repeated 12 times.
This process of generating bootstrap samples from the original sample is repeated thousands of
times to get a better approximation of the sampling distribution of the mean (or any other
statistic). With the bootstrapping method, p-value calculations, hypothesis tests, and even
confidence intervals around the mean or the effect sizes become more accurate and replicable, even
if the assumptions about the population are violated (Erceg-Hurn & Mirosevich, 2008). Bootstrapping
is used to get a more robust estimate of the sampling distribution than the theoretical
distributions provide. This is most helpful when the population cannot be assumed to be normal or
homogeneous in variance.
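
The resampling loop can be sketched in a few lines of Python. This is an illustration under assumptions rather than a reproduction of any Excel add-in: the function name, the 5,000 resamples, the fixed random seed, and the percentile method for the interval are choices made here. The `size` argument defaults to the original sample size; passing `size=12` would mimic the example above.

```python
import random
import statistics

scores = [22, 22, 28, 27, 19, 2, 19, 88, 100, 1]  # Memo's sample
N_RESAMPLES = 5000


def bootstrap_means(data, n_resamples, rng, size=None):
    """Means of samples drawn with replacement from the original data."""
    size = len(data) if size is None else size
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=size)  # sampling with replacement
        means.append(sum(resample) / size)
    return means


rng = random.Random(42)
boot_means = sorted(bootstrap_means(scores, N_RESAMPLES, rng))

# Percentile interval: the middle 95% of the bootstrap means.
lower = boot_means[int(0.025 * N_RESAMPLES)]
upper = boot_means[int(0.975 * N_RESAMPLES) - 1]

print("mean of bootstrap means:", round(statistics.mean(boot_means), 2))
print("95% percentile interval:", lower, "to", upper)
```

The sorted list `boot_means` is the empirical approximation of the sampling distribution of the mean; replacing the mean with a median or a trimmed mean inside the loop gives the corresponding distribution for those statistics.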

Erceg-Hurn & Mirosevich (2008) report that the normality assumption is rarely met in practice,
supporting Micceri (1989), who found that real data are more likely to resemble an
exponential curve than a normal one. This causes many studies to wrongly reject the null
hypothesis or to report low power values. Besides normality, there is another issue which is
pretty important even if the population is rightly assumed to be normal: the existence of
differences in variance between random variables. Wilcox (1998) clearly mentions that a small
departure from homogeneity of variance may cause dramatically low power, even if the
sample sizes are equal. He rightly asks the question, “How many discoveries have been lost by
ignoring modern statistical methods?” (Wilcox, 1998, p. 1) and states that many results reported as
nonsignificant would have been found significant had modern statistical methods not been ignored.

Computer Software 

R is a statistical software package that is free and able to perform all the procedures described in
this paper. ZumaStat claims to be more user-friendly than R, and uses SPSS and Excel for robust
analysis. Other than these two, several Excel add-ins are available on the Internet and
through my webpage at http://people.tamu.edu/~sencer, where a short description and an
assessment of their ease of use are included.

Conclusion 

As Thompson (2008) warns, outliers are not merely evil people who distort all statistics for all
variables. In some cases, outliers and their effects are more important than the rest of the data,
and the mean would be a better estimate given the purpose of our research (e.g., the women who
were living a healthy life with the HIV virus). Osborne & Overbay (2004) say an unlikely
case might shed light on an important principle or issue, which might be too valuable to ignore.

In cases where outliers contaminate the general nature of the data set, robust statistical methods
are considered modern and generate more replicable results than classical methods. Given all
the recent developments in computer technology, it is easier than ever to use robust methods, and
perhaps it is time for all researchers to consider the effects of the modern methods.






References 

Donoho, D. L., & Huber, P. J. (1983). The notion of breakdown point. In P. Bickel, K. Doksum,
& J. L. Hodges Jr. (Eds.), A Festschrift for Erich Lehmann (pp. 157-184). Belmont, CA:
Wadsworth.

Erceg-Hurn, D. M., & Mirosevich, V. M. (2008). Modern robust statistical methods. American
Psychologist, 63, 591-601.

Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures.
Psychological Bulletin, 105, 156-166.

Osborne, J. W., & Overbay, A. (2004). The power of outliers (and why researchers should
always check for them). Practical Assessment, Research & Evaluation, 9(6). Retrieved
from http://PAREonline.net/getvn.asp?v=9&n=6

Reid, N. (2006). Influence functions. In (No Ed.), Encyclopedia of statistical sciences. Retrieved
February 1, 2009, from
http://mrw.interscience.wiley.com/emrw/9780471667193/ess/article/ess1240/current/pdf

Rousseeuw, P. J. (1984). Least median of squares regression. Journal of the American Statistical
Association, 79, 871-880.

Rousseeuw, P. J. (1991). Tutorial to robust statistics. Journal of Chemometrics, 5(1), 1-20.

Thompson, B. (2008). Foundations of Behavioral Statistics: An Insight-Based Approach
(Paperback ed.). NY: Guilford.

Wilcox, R. R. (1998). How many discoveries have been lost by ignoring modern statistical
methods? American Psychologist, 53, 300-314.

Wilcox, R. R. (2005). Introduction to robust estimation and hypothesis testing (2nd ed.). San
Diego, CA: Academic Press.




